AMENDED CLAIMS

[received by the International Bureau on 12 May 2000 (12.05.00); 
original claims 1, 17 and 33 amended; remaining claims unchanged (3 pages)] 



1. A system for generating a description record from multimedia information,
comprising:

(a) at least one multimedia information input interface receiving said
multimedia information;

(b) a computer processor, coupled to said at least one multimedia
information input interface, receiving said multimedia information
therefrom, processing said multimedia information by performing
object extraction processing to generate multimedia object
descriptions from said multimedia information, and processing said
generated multimedia object descriptions by object hierarchy
processing to generate multimedia object hierarchy descriptions
indicative of an organization of said object descriptions, wherein at
least one description record including said multimedia object
descriptions and said multimedia object hierarchy descriptions is
generated for content embedded within said multimedia
information; and

(c) a data storage system, operatively coupled to said processor, for
storing said at least one description record.



2. The system of claim 1, wherein said multimedia information comprises
image information, said multimedia object descriptions comprise image object
descriptions, and said multimedia object hierarchy descriptions comprise image
object hierarchy descriptions.

3. The system of claim 2, wherein said object extraction processing comprises:

(a) image segmentation processing to segment each image in said
image information into regions within said image; and

(b) feature extraction processing to generate one or more feature
descriptions for one or more of said regions;

whereby said generated object descriptions comprise said one or more feature
descriptions for one or more of said regions.

4. The system of claim 3, wherein said one or more feature descriptions are
selected from the group consisting of text annotations, color, texture, shape, size,
and position.



5. The system of claim 2, wherein said object hierarchy processing comprises
physical object hierarchy organization to generate physical object hierarchy
descriptions of said image object descriptions that are based on spatial
characteristics of said objects, such that said image object hierarchy descriptions
comprise physical descriptions.

6. The system of claim 5, wherein said object hierarchy processing further
comprises logical object hierarchy organization to generate logical object hierarchy
descriptions of said image object descriptions that are based on semantic
characteristics of said objects, such that said image object hierarchy descriptions
comprise both physical and logical descriptions.

7. The system of claim 6, wherein said object extraction processing comprises:

(a) image segmentation processing to segment each image in said
image information into regions within said image; and

(b) feature extraction processing to generate object descriptions for one
or more of said regions;

and wherein said physical hierarchy organization and said logical hierarchy
organization generate hierarchy descriptions of said object descriptions for said
one or more of said regions.

8. The system of claim 7, further comprising an encoder receiving said image
object hierarchy descriptions and said image object descriptions, and encoding said 
image object hierarchy descriptions and said image object descriptions into 
encoded description information, wherein said data storage system is operative to 
store said encoded description information as said at least one description record. 
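
Editorial illustration (not part of the claims): the image pipeline recited in claims 3 to 8 can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicants' implementation. The names `ImageObject`, `HierarchyNode`, `physical_hierarchy`, `logical_hierarchy` and `encode_record` are hypothetical, a bounding box stands in for the size and position features, spatial containment is assumed as the "spatial characteristic" behind the physical hierarchy, and JSON is assumed as the encoding.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ImageObject:
    """One image object description: a segmented region plus its features."""
    object_id: str
    bbox: tuple           # (x0, y0, x1, y1); assumed stand-in for size and position
    annotation: str = ""  # free-text annotation feature


@dataclass
class HierarchyNode:
    """One node of a hierarchy description; refers to an object id or a group label."""
    ref: str
    children: list = field(default_factory=list)


def contains(outer, inner):
    """True when bounding box `inner` lies entirely inside `outer`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def area(bbox):
    return (bbox[2] - bbox[0]) * (bbox[3] - bbox[1])


def physical_hierarchy(objects):
    """Attach each object to the smallest strictly larger object that contains it."""
    nodes = {o.object_id: HierarchyNode(o.object_id) for o in objects}
    roots = []
    for o in objects:
        parents = [p for p in objects
                   if p is not o
                   and contains(p.bbox, o.bbox)
                   and area(p.bbox) > area(o.bbox)]
        if parents:
            parent = min(parents, key=lambda p: area(p.bbox))
            nodes[parent.object_id].children.append(nodes[o.object_id])
        else:
            roots.append(nodes[o.object_id])
    return roots


def logical_hierarchy(groups):
    """Group object ids under semantic labels, e.g. supplied by an annotator."""
    return [HierarchyNode(label, [HierarchyNode(i) for i in ids])
            for label, ids in groups.items()]


def encode_record(objects, physical, logical):
    """Encode object descriptions and both hierarchies into one description record."""
    record = {
        "objects": [asdict(o) for o in objects],
        "hierarchies": {
            "physical": [asdict(n) for n in physical],
            "logical": [asdict(n) for n in logical],
        },
    }
    return json.dumps(record)


objects = [
    ImageObject("person", (10, 10, 50, 90), "a person"),
    ImageObject("face", (20, 15, 40, 35)),
    ImageObject("ball", (60, 60, 70, 70)),
]
roots = physical_hierarchy(objects)
encoded = encode_record(objects, roots, logical_hierarchy({"people": ["person", "face"]}))
```

In this sketch the object descriptions are stored once and both hierarchies refer to them by identifier, so the same objects can be organized physically and logically within a single record.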

9. The system of claim 1, wherein said multimedia information comprises
video information, said multimedia object descriptions comprise video object 
descriptions including both event descriptions and object descriptions, and said 
multimedia hierarchy descriptions comprise video object hierarchy descriptions 
including both event hierarchy descriptions and object hierarchy descriptions. 

10. The system of claim 9, wherein said object extraction processing comprises: 

(a) temporal video segmentation processing to temporally segment said 
video information into one or more video events or groups of video 
events and generate event descriptions for said video events, 

(b) video object extraction processing to segment said one or more 
video events or groups of video events into one or more regions, 
and to generate object descriptions for said regions; and 

(c) feature extraction processing to generate one or more event feature 
descriptions for said one or more video events or groups of video 
events, and one or more object feature descriptions for said one or 
more regions; 

wherein said generated video object descriptions include said event feature 
descriptions and said object descriptions. 

11. The system of claim 10, wherein said one or more event feature
descriptions are selected from the group consisting of text annotations, shot
transition, camera motion, time and key frame, and wherein said one or more
object feature descriptions are selected from the group consisting of color, texture,
shape, size, position, motion, and time.

12. The system of claim 9, wherein said object hierarchy processing comprises
physical event hierarchy organization to generate physical event hierarchy
descriptions of said video object descriptions that are based on temporal
characteristics of said video objects, such that said video hierarchy descriptions
comprise temporal descriptions.

13. The system of claim 12, wherein said object hierarchy processing further
comprises logical event hierarchy organization to generate logical event hierarchy
descriptions of said video object descriptions that are based on semantic
characteristics of said video objects, such that said hierarchy descriptions comprise
both temporal and logical descriptions.

14. The system of claim 13, wherein said object hierarchy processing further
comprises physical and logical object hierarchy extraction processing, receiving
said temporal and logical descriptions and generating object hierarchy descriptions
for video objects embedded within said video information, such that said video
hierarchy descriptions comprise temporal and logical event and object descriptions.

15. The system of claim 14, wherein said object extraction processing
comprises:



(a) temporal video segmentation processing to temporally segment said
video information into one or more video events or groups of video 
events and generate event descriptions for said video events, 

(b) video object extraction processing to segment said one or more 
video events or groups of video events into one or more regions, 
and to generate object descriptions for said regions; and 






WO 00/28440 



PCT/US99/26125 






(c) feature extraction processing to generate one or more event feature 
descriptions for said one or more video events or groups of video 
events, and one or more object feature descriptions for said one or 
more regions; 

wherein said generated video object descriptions include said event feature 
descriptions and said object descriptions, and wherein said physical event hierarchy 
organization and said logical event hierarchy organization generate hierarchy 
descriptions from said event feature descriptions, and wherein said physical object 
hierarchy organization and said logical object hierarchy organization generate 
hierarchy descriptions from said object feature descriptions.

16. The system of claim 15, further comprising an encoder receiving said video
object hierarchy descriptions and said video object descriptions, and encoding said
video object hierarchy descriptions and said video object descriptions into
encoded description information, wherein said data storage system is operative to 
store said encoded description information as said at least one description record. 
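
Editorial illustration (not part of the claims): temporal segmentation into video events and a physical event hierarchy based on temporal characteristics, as recited in claims 10 and 12, might be sketched as below. The claims do not specify how events are detected. Thresholding a per-frame difference score is an assumption chosen for illustration, and `Event`, `segment_shots`, `group_events`, `threshold` and `group_size` are hypothetical names and parameters.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """One video event description with its time feature (frame range)."""
    event_id: str
    start: int  # first frame index, inclusive
    end: int    # last frame index, exclusive
    children: list = field(default_factory=list)


def segment_shots(frame_diffs, threshold):
    """Split a video into shots.

    frame_diffs[i] is an assumed difference score between frame i and
    frame i + 1; a score above `threshold` is treated as a shot transition.
    """
    shots, start = [], 0
    for i, diff in enumerate(frame_diffs):
        if diff > threshold:
            shots.append(Event(f"shot{len(shots)}", start, i + 1))
            start = i + 1
    # The last shot runs to the final frame (there is one more frame than diffs).
    shots.append(Event(f"shot{len(shots)}", start, len(frame_diffs) + 1))
    return shots


def group_events(shots, group_size):
    """Physical event hierarchy: consecutive shots grouped under parent events."""
    groups = []
    for g in range(0, len(shots), group_size):
        members = shots[g:g + group_size]
        groups.append(Event(f"group{len(groups)}",
                            members[0].start, members[-1].end, members))
    return groups


shots = segment_shots([0.1, 0.9, 0.2, 0.1, 0.8, 0.1], threshold=0.5)
hierarchy = group_events(shots, group_size=2)
```

Each parent event spans exactly the frames of its children, which is one simple way to make the hierarchy reflect temporal containment. A logical event hierarchy would instead group events by semantic labels.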

17. A method for generating a description record from multimedia information, 
comprising the steps of: 

(a) receiving said multimedia information; 

(b) processing said multimedia information by performing object
extraction processing to generate multimedia object descriptions
from said multimedia information;



(c) processing said generated multimedia object descriptions by object 
hierarchy processing to generate multimedia object hierarchy 
descriptions indicative of an organization of said object 
descriptions, wherein at least one description record including said 
multimedia object descriptions and said multimedia object hierarchy 
descriptions is generated for content embedded within said 
multimedia information; and 

(d) storing said at least one description record.

18. The method of claim 17, wherein said multimedia information comprises
image information, said multimedia object descriptions comprise image object
descriptions, and said multimedia object hierarchy descriptions comprise image
object hierarchy descriptions.

19. The method of claim 2, wherein said object extraction processing step
comprises the sub-steps of:

(a) image segmentation processing to segment each image in said
image information into regions within said image; and

(b) feature extraction processing to generate one or more feature
descriptions for one or more of said regions;

whereby said generated image object descriptions comprise said one or more
feature descriptions for one or more of said regions.

20. The method of claim 19, wherein said one or more feature descriptions are
selected from the group consisting of text annotations, color, texture, shape, size,
and position.

21. The method of claim 18, wherein said step of object hierarchy processing
includes the sub-step of physical object hierarchy organization to generate physical
object hierarchy descriptions of said image object descriptions that are based on
spatial characteristics of said objects, such that said image hierarchy descriptions
comprise physical descriptions.

22. The method of claim 21, wherein said step of object hierarchy processing further
includes the sub-step of logical object hierarchy organization to generate logical
object hierarchy descriptions of said image object descriptions that are based on
semantic characteristics of said objects, such that said image object hierarchy
descriptions comprise both physical and logical descriptions.

23. The method of claim 22, wherein said step of object extraction processing
further includes the sub-steps of: 

(a) image segmentation processing to segment each image in said 
image information into regions within said image; and 

(b) feature extraction processing to generate object descriptions for one 
or more of said regions;

and wherein said physical object hierarchy organization sub-step and said logical 
object hierarchy organization sub-step generate hierarchy descriptions of said 
object descriptions for said one or more of said regions. 

24. The method of claim 24, further comprising the step of encoding said 
image object descriptions and said image object hierarchy descriptions into 
encoded description information prior to said data storage step.

25. The method of claim 17, wherein said multimedia information comprises
video information, said multimedia object descriptions comprise video object 
descriptions including both event descriptions and object descriptions, and said 
multimedia hierarchy descriptions comprise video object hierarchy descriptions 
including both event hierarchy descriptions and object hierarchy descriptions. 

26. The method of claim 25, wherein said step of object extraction processing 
comprises the sub-steps of: 

(a) temporal video segmentation processing to temporally segment said 
video information into one or more video events or groups of video 
events and generate event descriptions for said video events, 

(b) video object extraction processing to segment said one or more 
video events or groups of video events into one or more regions, 
and to generate object descriptions for said regions; and

(c) feature extraction processing to generate one or more event feature
descriptions for said one or more video events or groups of video
events, and one or more object feature descriptions for said one or
more regions;

wherein said generated video object descriptions include said event feature
descriptions and said object descriptions.

27. The method of claim 26, wherein said one or more event feature
descriptions are selected from the group consisting of text annotations, shot
transition, camera motion, time and key frame, and wherein said one or more
object feature descriptions are selected from the group consisting of color, texture,
shape, size, position, motion, and time.

28. The method of claim 25, wherein said step of object hierarchy processing
includes the sub-step of physical event hierarchy organization to generate physical
event hierarchy descriptions of said video object descriptions that are based on
temporal characteristics of said video objects, such that said video hierarchy
descriptions comprise temporal descriptions.

29. The method of claim 28, wherein said step of object hierarchy processing
further includes the sub-step of logical event hierarchy organization to generate
logical event hierarchy descriptions of said video object descriptions that are based
on semantic characteristics of said video objects, such that said hierarchy
descriptions comprise both temporal and logical descriptions.

30. The method of claim 29, wherein said step of object hierarchy processing
further comprises the sub-step of physical and logical object hierarchy extraction
processing, receiving said temporal and logical descriptions and generating object
hierarchy descriptions for video objects embedded within said video information,
such that said video hierarchy descriptions comprise temporal and logical event and
object descriptions.

31. The method of claim 30, wherein said step of object extraction processing
comprises the sub-steps of:

(a) temporal video segmentation processing to temporally segment said
video information into one or more video events or groups of video
events and generate event descriptions for said video events,

(b) video object extraction processing to segment said one or more
video events or groups of video events into one or more regions,
and to generate object descriptions for said regions; and

(c) feature extraction processing to generate one or more event feature
descriptions for said one or more video events or groups of video
events, and one or more object feature descriptions for said one or
more regions;

wherein said generated video object descriptions include said event feature
descriptions and said object descriptions, and wherein said physical event hierarchy
organization and said logical event hierarchy organization generate hierarchy
descriptions from said event feature descriptions, and wherein said physical object
hierarchy organization and said logical object hierarchy organization generate
hierarchy descriptions from said object feature descriptions.

32. The method of claim 15, further comprising the step of encoding said video
object descriptions and said video object hierarchy descriptions into encoded
description information prior to said data storage step.

33. A computer readable media containing digital information with at least one
multimedia description record describing multimedia content for corresponding
multimedia information, the description record comprising:

(a) one or more multimedia object descriptions describing
corresponding multimedia objects;

(b) one or more features characterizing each of said multimedia object
descriptions; and

(c) one or more multimedia object hierarchy descriptions indicative of
an organization of said object descriptions, if any, relating at least a
portion of said one or more multimedia objects in accordance with
one or more characteristics.

34. The computer readable media of claim 33, wherein said multimedia
information comprises image information, said multimedia objects comprise image
objects, said multimedia object descriptions comprise image object descriptions,
and said multimedia object hierarchy descriptions comprise image object hierarchy
descriptions.

35. The computer readable media of claim 34, wherein said one or more
features are selected from the group consisting of text annotations, color, texture,
shape, size, and position.

36. The computer readable media of claim 34, wherein said image object
hierarchy descriptions comprise physical object hierarchy descriptions of said
image object descriptions based on spatial characteristics of said image objects.

37. The computer readable media of claim 36, wherein said image object
hierarchy descriptions further comprise logical object hierarchy descriptions of
said image object descriptions based on semantic characteristics of said image
objects.



38. The computer readable media of claim 33, wherein said multimedia
information comprises video information, said multimedia objects comprise events
and video objects, said multimedia object descriptions comprise video object
descriptions including both event descriptions and object descriptions; said features
comprise video event features and video object features, and said multimedia
hierarchy descriptions comprise video object hierarchy descriptions including both
event hierarchy descriptions and object hierarchy descriptions.

39. The computer readable media of claim 38, wherein said one or more event
feature descriptions are selected from the group consisting of text annotations, shot
transition, camera motion, time and key frame, and wherein said one or more
object feature descriptions are selected from the group consisting of color, texture,
shape, size, position, motion, and time.

40. The computer readable media of claim 38, wherein said event hierarchy
descriptions comprise one or more physical hierarchy descriptions of said events
based on temporal characteristics.

41. The computer readable media of claim 40, wherein said event hierarchy
descriptions further comprise one or more logical hierarchy descriptions of said
events based on semantic characteristics.

42. The computer readable media of claim 38, wherein said object hierarchy
descriptions comprise one or more physical hierarchy descriptions of said objects
based on temporal characteristics.

43. The computer readable media of claim 39, wherein said object hierarchy
descriptions further comprise one or more logical hierarchy descriptions of said
objects based on semantic characteristics.
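
Editorial illustration (not part of the claims): a video description record of the kind recited in claims 33, 38 and 39 could be checked for internal consistency as sketched below. The dictionary layout, the snake_case feature names and the function `validate_record` are assumptions made for illustration. Only the feature groups themselves (event features and object features) are taken from claim 39.

```python
# Feature groups from claim 39; the snake_case spellings are an assumption.
EVENT_FEATURES = {"text_annotation", "shot_transition", "camera_motion",
                  "time", "key_frame"}
OBJECT_FEATURES = {"color", "texture", "shape", "size", "position",
                   "motion", "time"}


def validate_record(record):
    """Return a list of problems found in a description record (empty if none)."""
    problems = []
    allowed = {"event": EVENT_FEATURES, "object": OBJECT_FEATURES}
    known_ids = set()

    # Every description may only carry features from the group for its kind.
    for desc in record["descriptions"]:
        known_ids.add(desc["id"])
        for name in sorted(set(desc["features"]) - allowed[desc["kind"]]):
            problems.append(f"{desc['id']}: unknown {desc['kind']} feature {name!r}")

    # Every hierarchy node must refer to a description present in the record.
    def walk(node, hierarchy_name):
        if node["ref"] not in known_ids:
            problems.append(f"{hierarchy_name}: dangling reference {node['ref']!r}")
        for child in node.get("children", []):
            walk(child, hierarchy_name)

    for hierarchy_name, roots in record["hierarchies"].items():
        for root in roots:
            walk(root, hierarchy_name)
    return problems


record = {
    "descriptions": [
        {"id": "e1", "kind": "event",
         "features": {"camera_motion": "pan", "time": [0, 120]}},
        {"id": "o1", "kind": "object",
         "features": {"color": "red", "motion": "left-to-right"}},
    ],
    "hierarchies": {
        "event_physical": [{"ref": "e1"}],
        "object_logical": [{"ref": "o1"}],
    },
}
problems = validate_record(record)
```

The check reflects the structure of claim 33: hierarchy descriptions organize object descriptions that exist elsewhere in the same record, so a reference that resolves to nothing indicates a malformed record.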



